Translation Disambiguation Using Bilingual Bootstrapping
Authors
Abstract
This article proposes a new method for word translation disambiguation, one that uses a machine-learning technique called bilingual bootstrapping. In learning to disambiguate words to be translated, bilingual bootstrapping makes use of a small amount of classified data and a large amount of unclassified data in both the source and the target languages. It repeatedly constructs classifiers in the two languages in parallel and boosts the performance of the classifiers by classifying unclassified data in the two languages and by exchanging information regarding classified data between the two languages. Experimental results indicate that word translation disambiguation based on bilingual bootstrapping consistently and significantly outperforms existing methods that are based on monolingual bootstrapping.
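The loop described in the abstract can be illustrated with a toy sketch. This is not the article's implementation: the naive Bayes word counts, the log-likelihood margin used to accept a label, the romanized second-language tokens, the small dictionary, and every function name below are assumptions chosen for illustration. The sketch only shows the general shape of the idea: two classifiers are rebuilt in parallel each round, each one trained on its own classified data plus the other language's classified data mapped through a translation dictionary, and each one moves confidently classified contexts from the unclassified pool into the classified pool.

```python
import math
from collections import Counter

# Hypothetical translation candidates for one ambiguous source word.
LABELS = ("factory", "flora")

def train(labeled):
    """Count context words per label; the counts act as a naive Bayes model."""
    counts = {y: Counter() for y in LABELS}
    for words, y in labeled:
        counts[y].update(words)
    return counts

def scores(model, words):
    """Log-likelihood of a context under each label, with add-one smoothing."""
    vocab = len(set().union(*model.values())) + 1  # +1 slot for unseen words
    return {
        y: sum(math.log((c[w] + 1) / (sum(c.values()) + vocab)) for w in words)
        for y, c in model.items()
    }

def classify(model, words):
    """Return the label with the highest score for this context."""
    s = scores(model, words)
    return max(s, key=s.get)

def translate(labeled, dictionary):
    """Map classified contexts into the other language; untranslatable words are dropped."""
    return [([dictionary[w] for w in words if w in dictionary], y) for words, y in labeled]

def absorb(model, labeled, unlabeled, margin):
    """Move contexts that the model labels with a clear margin into the classified set."""
    remaining = []
    for words in unlabeled:
        s = scores(model, words)
        best, runner_up = sorted(s.values(), reverse=True)[:2]
        if best - runner_up >= margin:
            labeled.append((words, max(s, key=s.get)))
        else:
            remaining.append(words)
    return remaining

def bilingual_bootstrap(lab_a, unlab_a, lab_b, unlab_b, a_to_b, b_to_a, rounds=3, margin=0.5):
    """Assumed simplification: retrain both classifiers each round, then grow both classified sets."""
    lab_a, lab_b = list(lab_a), list(lab_b)
    for _ in range(rounds):
        # Exchange step: each model also sees the other language's classified data.
        model_a = train(lab_a + translate(lab_b, b_to_a))
        model_b = train(lab_b + translate(lab_a, a_to_b))
        unlab_a = absorb(model_a, lab_a, unlab_a, margin)
        unlab_b = absorb(model_b, lab_b, unlab_b, margin)
    return (train(lab_a + translate(lab_b, b_to_a)),
            train(lab_b + translate(lab_a, a_to_b)), lab_a, lab_b)

# Toy data (all hypothetical). Language A is English; language B uses romanized tokens.
B_TO_A = {"gongren": "worker", "shengchan": "production", "jiqi": "machine",
          "yezi": "leaf", "huayuan": "garden", "shui": "water"}
A_TO_B = {a: b for b, a in B_TO_A.items()}
LAB_A = [(["worker", "machine"], "factory"), (["leaf", "water"], "flora")]
LAB_B = [(["gongren", "shengchan"], "factory"), (["yezi", "huayuan"], "flora")]
UNLAB_A = [["production", "output"]]
UNLAB_B = [["jiqi", "chanliang"]]

model_a, model_b, lab_a, lab_b = bilingual_bootstrap(
    LAB_A, UNLAB_A, LAB_B, UNLAB_B, A_TO_B, B_TO_A)
print(classify(model_a, ["output"]))
```

In this toy data, the English context containing "production" cannot be labeled from the English classified data alone, because neither of its words appears there. It becomes classifiable only once the second language's classified data is mapped through the dictionary. Passing empty dictionaries to `bilingual_bootstrap` gives a monolingual bootstrapping baseline for comparison.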
Related resources
Word Translation Disambiguation Using Bilingual Bootstrapping
This article proposes a new method for word translation disambiguation, one that uses a machine-learning technique called bilingual bootstrapping. In learning to disambiguate words to be translated, bilingual bootstrapping makes use of a small amount of classified data and a large amount of unclassified data in both the source and the target languages. It repeatedly constructs classifiers in the...
Together We Can: Bilingual Bootstrapping for WSD
Recent work on bilingual Word Sense Disambiguation (WSD) has shown that a resource-deprived language (L1) can benefit from the annotation work done in a resource-rich language (L2) via parameter projection. However, this method assumes the presence of sufficient annotated data in one resource-rich language, which may not always be possible. Instead, we focus on the situation where there are two ...
Augmenting a Bilingual Lexicon with Information for Word Translation Disambiguation
We describe a method for augmenting a bilingual lexicon with additional information for selecting an appropriate translation word. For each word in the source language, we calculate a correlation matrix of its association words versus its translation candidates. We estimate the degree of correlation by using comparable corpora based on these assumptions: “parallel word associations” and “one se...
Query Translation using Wikipedia-based resources for analysis and disambiguation
This work investigates query translation using only Wikipedia-based resources in a two-step approach: analysis and disambiguation. After arguing that data mined from Wikipedia is particularly relevant to query translation, both from a lexical and a semantic perspective, we detail the implementation of the approach. In the analysis phase, lexical units are extracted from queries and associated t...
A Word Sense Disambiguation Method Using Bilingual Corpus
This paper proposes a word sense disambiguation (WSD) method using a bilingual corpus in an English-Chinese machine translation system. A mathematical model is constructed to disambiguate words in terms of context phrasal collocation. A rule-learning algorithm is proposed, and an algorithm for applying the learned rules is also provided, which can increase the recall ratio. Finally, an analysis is...